The Degrees of Freedom of MIMO
Interference Channels without State 
Information at Transmitters 

Yan Zhu and Dongning Guo 

Abstract 

This paper fully determines the degree-of-freedom (DoF) region of two-user interference channels 
with an arbitrary number of transmit and receive antennas in the case of isotropic and independent (or
block-wise independent) fading, where the channel state information is available to the receivers but 
not to the transmitters. The result characterizes the capacity region to the first order of the logarithm 
of the signal-to-noise ratio (SNR) in the high-SNR regime. The DoF region is achieved using random 
Gaussian codebooks independent of the channel states, which implies that it is impossible to increase 
the DoF using beamforming and interference alignment in the absence of channel state information at 
the transmitters. 

Index Terms 

Capacity region, channel state information, degree of freedom (DoF), interference channel, isotropic 
fading, multiple antennas, multiple-input multiple-output (MIMO) channel, wireless networks. 

I. Introduction 

The interference channel is one of the most important models for the physical layer of wireless
networks. Some recent breakthroughs in understanding the fundamental limits of such channels,
with or without multiple antennas, are reported in [1]–[6]. Most existing studies of interference
channels assume that full channel state information (CSI) is available to all transmitters and
receivers. In practice, however, the state of the channel is usually measured at the receivers, and
it is often difficult for the transmitters to acquire the CSI accurately in a timely manner.

Fig. 1. A two-user MIMO interference channel.

This paper studies a two-user multiple-input multiple-output (MIMO) interference channel
subject to isotropic fading, where the channel state is independent over time, and its realization
is known to the receivers but not to the transmitters. The channel model is described in Section II.
An example of the channel is illustrated in Fig. 1. The degree-of-freedom (DoF) region of the
MIMO interference channel is completely characterized by Theorem 1 in Section III. This is the
main result of this paper. The result indicates that without CSI at the transmitters (CSIT), no
additional gains in terms of DoF can be achieved using beamforming or interference alignment,
which is in contrast to the results for the case with full CSI shown in [6]. A detailed proof of
Theorem 1 is developed in Sections III and IV.



Related works [7]–[12] also consider interference channels without CSIT. The case of slow
fading is modeled as a compound interference channel in [7], [8], where the capacity of a
single-antenna two-user interference channel is studied in [7], and the diversity-multiplexing trade-off
of the same model is studied in [8]. In the case of fast (independent) fading, Akuiyibo et al.
[9] derived an outer bound on the capacity region of two-user MIMO interference channels with
Rayleigh fading, which is tight in terms of the DoF in some special cases. Tighter outer bounds
on the DoF region have been developed by Huang et al. in [10], who also assume Rayleigh fading,
by Vaze and Varanasi in [11], who assume a more general model, and by the authors in [12],
under the assumption of general isotropic fading.¹ A gap remains between the inner and outer
bounds in [10]–[12]. A specific example is the case where the two users have one and three
transmit antennas, and two and four receive antennas, respectively, as shown in Fig. 1. The DoF
pair (1, 1) has been shown to be achievable, but the best outer bounds in [10]–[12] include the
pair (1, 1.5). This paper closes the gap by showing that the achievable region of [12] is the exact
DoF region. In the aforementioned case, the pair (1, 1.5) is not achievable.

II. Channel Model 

Consider a two-user interference channel, where each transmitter has a dedicated message
for its intended receiver. Suppose transmitter $t$ is equipped with $M_t$ antennas and receiver $r$ is
equipped with $N_r$ antennas for $t, r = 1, 2$. The signals received in the $i$-th interval by the two
users can be described as²

$$y[i] = H_{11}[i]\,w[i] + H_{12}[i]\,x[i] + u_1[i] \tag{1a}$$
$$z[i] = H_{21}[i]\,w[i] + H_{22}[i]\,x[i] + u_2[i] \tag{1b}$$

where $w$ ($M_1 \times 1$) and $x$ ($M_2 \times 1$) denote the transmitted signals, $H_{rt}$ ($N_r \times M_t$) denotes the
channel from transmitter $t$ to receiver $r$, and $u_r$ ($N_r \times 1$) denotes the thermal noise at receiver
$r$, which consists of independent identically distributed (i.i.d.) circularly symmetric complex
Gaussian (CSCG) random variables of unit variance (denoted by $u_r \sim \mathcal{CN}(0, I_{N_r})$). The noise
process $\{u_r[i]\}$ is i.i.d. over time ($i = 1, 2, \dots$) and independent of the signals and fading
processes $\{H_{r1}[i], H_{r2}[i]\}$.

The usual power constraint on all codewords of both users is assumed, i.e., codewords
$(w[1], \dots, w[n])$ and $(x[1], \dots, x[n])$ satisfy

$$\frac{1}{n}\sum_{i=1}^{n} \|w[i]\|^2 \le \gamma \quad\text{and}\quad \frac{1}{n}\sum_{i=1}^{n} \|x[i]\|^2 \le \gamma$$

where $\|\cdot\|$ stands for the Euclidean norm of a vector (more generally, it denotes the Frobenius
norm of a matrix). Since the noise processes are normalized, $\gamma$ is regarded as the constraint on
the average transmit signal-to-noise ratio (SNR).
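The power constraint can be checked numerically for a concrete codeword. The following is a minimal sketch, not part of the paper: the function names and the toy codeword are hypothetical, and a codeword is represented simply as a list of complex vectors, one per time slot.

```python
def average_power(codeword):
    """Return (1/n) * sum_i ||w[i]||^2 for a codeword given as a list of complex vectors."""
    n = len(codeword)
    return sum(abs(entry) ** 2 for vector in codeword for entry in vector) / n


def satisfies_power_constraint(codeword, gamma):
    """True if the average squared Euclidean norm per slot does not exceed gamma."""
    return average_power(codeword) <= gamma


# Hypothetical codeword of length n = 3 for a transmitter with two antennas.
codeword = [[1 + 1j, 0j], [0j, 2 + 0j], [1j, 1 + 0j]]
print(average_power(codeword))
print(satisfies_power_constraint(codeword, 3.0))
```

Because the noise has unit variance per receive antenna, the same number `gamma` acts as the SNR constraint.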

1 The fading models of [11] and [12] overlap but neither fully covers the other. Both models include independent Rayleigh
fading studied in [10] as a special case.

2 As a convention, we use bold fonts to denote random variables, random vectors and random matrices, and we use the 
corresponding normal fonts to denote their realizations. 
The no-CSIT assumption means that the realization of $(H_{r1}, H_{r2})$ is available to receiver $r$
only ($r = 1, 2$), whereas the transmitters have no knowledge about the channel matrices except
for their statistics. The fading process is assumed to be block-wise independent, i.e., the channel
matrices $H_{rt}[i]$ remain the same for $T$ consecutive time slots and then change to
independent values in the next block of $T$ slots. The constant $T$ is often referred to as the
coherence time [13]. Moreover, the coherence blocks of all links are perfectly aligned, meaning
that the gains of all links change at the same time. In particular, if $T = 1$, the fading process
becomes i.i.d. over time.
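The block-wise independent model can be illustrated with a short simulation sketch. It is a simplification under stated assumptions: each link is reduced to a scalar gain, the gain is drawn as unit-variance CSCG (Rayleigh fading, one special case of isotropic fading), and the names `block_fading_gains` and `cscg_gain` are hypothetical.

```python
import random


def block_fading_gains(n_slots, T, draw):
    """Return one gain per time slot; a fresh independent draw starts every block of T slots."""
    gains = []
    current = None
    for i in range(n_slots):
        if i % T == 0:  # first slot of a new coherence block
            current = draw()
        gains.append(current)
    return gains


def cscg_gain(rng):
    """Draw a unit-variance circularly symmetric complex Gaussian scalar."""
    std = 0.5 ** 0.5
    return complex(rng.gauss(0.0, std), rng.gauss(0.0, std))


rng = random.Random(0)
gains = block_fading_gains(7, 3, lambda: cscg_gain(rng))
for slot, gain in enumerate(gains, start=1):
    print(slot, gain)
```

Calling the same function with identical `T` and block boundaries for every link corresponds to perfectly aligned coherence blocks; `T = 1` gives i.i.d. fading.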

The statistics of the fading processes are arbitrary except that all $H_{rt}$ are almost surely of
full rank, of finite average power, i.e., $\mathsf{E}\|H_{rt}\|^2 < \infty$, and isotropic in the following sense:

Definition 1: A complex-valued random matrix $G$ is isotropic if $GQ$ is identically distributed
as $G$ for every deterministic unitary matrix $Q$ of compatible size.

We adopt this notion of isotropic fading, which was introduced in [14]. In the absence of
CSIT, isotropic fading is a plausible assumption because there is no reason to prefer signaling
toward any direction over any other. Furthermore, many important fading models belong to
this category, including Rayleigh fading studied in [10], where the channel matrices consist of
i.i.d. CSCG entries.

III. The Main Theorem and Achievability Proof 

A rate pair $(R_1, R_2)$ is said to be achievable if there exist two codebooks of size $\lceil 2^{nR_1} \rceil$ and
$\lceil 2^{nR_2} \rceil$ for the two users, respectively, such that the average decoding error at each receiver
vanishes as the code length $n \to \infty$. The DoF region is defined as³

$$\mathcal{D} = \left\{ (d_1, d_2) \;\middle|\; \exists \text{ positive achievable pair } (R_1(\gamma), R_2(\gamma)) \text{ with } d_j = \lim_{\gamma \to \infty} \frac{R_j(\gamma)}{\log(1 + \gamma)},\ j = 1, 2 \right\}.$$

Evidently, a DoF is essentially the number of single-antenna point-to-point links that provide
the same rate at high SNRs [6], [15].⁴
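The limit in the definition can be made concrete with a toy calculation. The sketch below is illustrative only: it assumes a fixed channel that splits into parallel sub-channels with given power gains and an equal power split across them, and the function names and gain values are hypothetical.

```python
import math


def parallel_channel_rate(gains, snr):
    """Rate in bits of len(gains) parallel Gaussian sub-channels with equal power split."""
    k = len(gains)
    return sum(math.log2(1 + snr * g / k) for g in gains)


def dof_ratio(gains, snr):
    """Ratio R(snr) / log2(1 + snr); it approaches the number of sub-channels as snr grows."""
    return parallel_channel_rate(gains, snr) / math.log2(1 + snr)


for snr in (1e3, 1e6, 1e12, 1e30):
    print(snr, dof_ratio([0.5, 1.0, 2.0], snr))
```

The ratio converges slowly, at a rate of order $1/\log\gamma$, which is why the DoF captures only the first-order behavior of the capacity in $\log\gamma$.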

3 Throughout this paper, the units of information are bits and all logarithms are of base 2. The DoF is of course invariant to
the units of information.

4 The generalized degree of freedom (GDoF) proposed in is out of the scope of this paper. 
Theorem 1: Suppose user 1 has no more receive antennas than user 2, i.e., $N_1 \le N_2$. The DoF
region of channel (1) with full-rank isotropic fading consists of all DoF pairs $(d_1, d_2)$ satisfying

$$0 \le d_j \le \min(M_j, N_j), \quad j = 1, 2 \tag{2a}$$
$$d_1 + \frac{\min(M_2, N_1) - L}{\min(M_2, N_2) - L}\,(d_2 - L) \le \min(M_1, N_1) \tag{2b}$$

where

$$L = \min(M_1 + M_2, N_1) - \min(M_1, N_1) \tag{3}$$

and we use the convention that $0/0 = 1$. The DoF region in the case of $N_1 > N_2$ is similarly
determined by symmetry.
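The region of Theorem 1 is easy to evaluate for given antenna numbers. The following sketch is an illustration rather than the authors' code; it assumes that (2b) reads $d_1 + \frac{\min(M_2,N_1)-L}{\min(M_2,N_2)-L}(d_2-L) \le \min(M_1,N_1)$ with $L$ as in (3), which is the form consistent with the three special cases (4), (5), and (35). The function name is hypothetical, and exact rational arithmetic is used so that boundary points are classified reliably.

```python
from fractions import Fraction


def dof_region_contains(M1, M2, N1, N2, d1, d2):
    """Return True if (d1, d2) satisfies (2a) and (2b) of Theorem 1."""
    if N1 > N2:
        # The theorem is stated for N1 <= N2; otherwise swap the roles of the users.
        return dof_region_contains(M2, M1, N2, N1, d2, d1)
    d1, d2 = Fraction(d1), Fraction(d2)
    # Single-user bounds (2a).
    if not (0 <= d1 <= min(M1, N1) and 0 <= d2 <= min(M2, N2)):
        return False
    L = min(M1 + M2, N1) - min(M1, N1)  # equation (3)
    numerator = min(M2, N1) - L
    denominator = min(M2, N2) - L
    # Convention 0/0 = 1; the denominator vanishes only when the numerator does.
    slope = Fraction(1) if denominator == 0 else Fraction(numerator, denominator)
    return d1 + slope * (d2 - L) <= min(M1, N1)  # weighted bound (2b)


# Example from the Introduction: (M1, M2) = (1, 3) and (N1, N2) = (2, 4).
print(dof_region_contains(1, 3, 2, 4, 1, 1))
print(dof_region_contains(1, 3, 2, 4, 1, Fraction(3, 2)))
```

For the example of Fig. 1 the check accepts the pair (1, 1) and rejects (1, 1.5), in line with the discussion in the Introduction.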

The coherence time $T$ has no bearing on the DoF region. The assumption that all links have
aligned coherence blocks in model (1) is important, as it prohibits interference alignment over
each coherence block. In fact, if the direct links and cross links have staggered coherence blocks
or different block sizes, interference alignment becomes possible [16], [17]. This is beyond the
scope of this paper.



The inequalities (2a) are the single-user bounds for the two users. As we shall see, $L$ can
be interpreted as the maximum DoF of user 2 without having negative impact on the DoF of
user 1. Therefore, (2b) describes the trade-off between the DoFs of the two users by carefully
balancing the interference, after $L$ degrees of freedom are guaranteed for user 2.

The achievability part of Theorem 1 can be proved by further dividing the parameter space
(assuming $N_1 \le N_2$ without loss of generality) into the following three cases:



a) $M_2 \le N_1$. In this case (2b) becomes

$$d_1 + d_2 \le \min(M_1 + M_2, N_1). \tag{4}$$

See Fig. 2(a) for an illustration. The DoF pair $(d_1, d_2)$ falls within the intersection of the
DoF regions of two multiaccess channels (MAC): one formed by the two transmitters and
receiver 1, and the other formed by the two transmitters and receiver 2. Therefore, the DoF
region is achievable by letting both users employ independent random Gaussian codebooks
and transmit common messages only. Since $N_1 \le N_2$, receiver 2 can always decode the
message of user 1 in the high-SNR regime.
Fig. 2. DoF regions for the cases of (a) $N_1 \ge M_2$; (b) $M_2 > N_1$, $M_1 \ge N_1$; and (c) $M_2 > N_1 > M_1$. The outer bound
developed in [10]–[12] agrees with the exact DoF region in cases (a) and (b) but is strictly looser in case (c), where the previous
outer bound is shown using dashed lines.



b) $M_2 > N_1$ and $M_1 \ge N_1$. In this case $L = 0$ and (2b) becomes

$$\frac{d_1}{N_1} + \frac{d_2}{\min(M_2, N_2)} \le 1. \tag{5}$$

The region becomes a triangle as shown in Fig. 2(b). For both $j = 1$ and $j = 2$, user
$j$ can achieve the single-user DoF $\min(M_j, N_j)$ as long as the other user is silent, so
the DoF pairs $(N_1, 0)$ and $(0, \min(M_2, N_2))$ are achievable. Hence the region
confined by (5) can be achieved by time sharing.
c) $M_2 > N_1 > M_1$. In this case $L = N_1 - M_1$ and (2b) becomes

$$\frac{d_1}{M_1} + \frac{d_2}{\min(M_2, N_2) - N_1 + M_1} \le \frac{\min(M_2, N_2)}{\min(M_2, N_2) - N_1 + M_1}. \tag{6}$$

The DoF region becomes a trapezoid, as illustrated in Fig. 2(c). It suffices to show that the
corner points on the dominant face of the region are achievable. Evidently, the DoF pair
$(0, \min(M_2, N_2))$ is achievable by activating only user 2. The pair $(M_1, N_1 - M_1)$ is
in fact within the intersection of the DoF regions of the two MACs described in Case
(a), and is therefore achievable.

In all, the achievability part of Theorem 1 has been established.

Note that for Cases (a) and (b), the DoF region agrees with the previous outer bound developed
in [10]–[12]. However, for Case (c), the previous outer bound is strictly loose.

The preceding proof indicates that the DoF region can be achieved either through time-division
multiple access (TDMA) or by the Han-Kobayashi scheme with common messages only [18]. It
suffices to use random Gaussian codebooks independent of the fading processes.
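The case analysis of the achievability proof can be summarized in a few lines of code. This is a hedged sketch with hypothetical function names; it assumes $N_1 \le N_2$ and treats the boundary $M_1 = N_1$ as part of Case (b), and it returns only the two corner points on the dominant face discussed in the proof.

```python
def classify_case(M1, M2, N1, N2):
    """Return 'a', 'b', or 'c' following the three cases of the achievability proof."""
    if N1 > N2:
        raise ValueError("the case analysis assumes N1 <= N2")
    if M2 <= N1:
        return "a"
    if M1 >= N1:
        return "b"
    return "c"  # M2 > N1 > M1


def dominant_corner_points(M1, M2, N1, N2):
    """Return the case label and the corner points (d1, d2) on the dominant face."""
    case = classify_case(M1, M2, N1, N2)
    d2_max = min(M2, N2)
    if case == "a":
        total = min(M1 + M2, N1)  # sum-DoF bound (4)
        d1_max = min(M1, N1)
        corners = [(total - d2_max, d2_max), (d1_max, total - d1_max)]
    elif case == "b":
        corners = [(0, d2_max), (N1, 0)]  # end points of the time-sharing line (5)
    else:
        corners = [(0, d2_max), (M1, N1 - M1)]  # end points of the slanted edge in (6)
    return case, corners


print(dominant_corner_points(1, 3, 2, 4))
print(dominant_corner_points(3, 3, 2, 4))
print(dominant_corner_points(2, 2, 3, 3))
```

Every point on the segment between the two corner points is then achievable by time sharing.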

IV. Proof of the Converse of Theorem 1

We assume $N_1 \le N_2$ throughout this section. We adopt the following notational convention.
The sequence $x[1], \dots, x[n]$ is denoted by $x^n$ or $\{x\}^n$. For simplicity, let $H$ denote
$(H_{11}, H_{12}, H_{21}, H_{22})$ so that $H^n$ denotes all the channel matrices over $n$ time slots.



A. Fading Statistics Revisited 

To facilitate the proof, we shall modify the assumption on the fading channel matrices $H_{rt}$
in this section without changing the capacity region. Roughly speaking, isotropic fading can be
decomposed into two independent components: the "amplitude" and the uniformly distributed
"phase." Precisely, we have the following result:

Lemma 1: Let $G$ ($N \times M$) be an isotropic random matrix and $K = \min(M, N)$. Let a compact
singular value decomposition (SVD) of $G$ be $G = W \Lambda V_1^\dagger$ with $W$ ($N \times K$), $\Lambda$ ($K \times K$) and
$V_1$ ($M \times K$). Let $Q$ be independent of $G$ and uniformly distributed on the set of $M \times M$ unitary
matrices: $\mathcal{Q} = \{Q \in \mathbb{C}^{M \times M} : Q^\dagger Q = I_M\}$. Set $V = Q V_1$. Then the following properties hold:
1) $V_1^\dagger V_1 = V^\dagger V = W^\dagger W = I_K$, and $\Lambda$ is diagonal with non-negative elements;

2) $V$ is independent of $(W, \Lambda, V_1)$ and is uniformly distributed on $\mathcal{V} = \{V \in \mathbb{C}^{M \times K} :
V^\dagger V = I_K\}$;

3) $G$ and $W \Lambda V^\dagger$ are identically distributed, denoted by $G \sim W \Lambda V^\dagger$.

Proof: Property 1 follows directly from the definition of the SVD. In particular, both $W$ and
$V_1$ have orthonormal columns.

Conditioned on any realization of $(W, \Lambda, V_1)$, the matrix $V = Q V_1$ is uniform on $\mathcal{V}$ because $Q$ is
uniform on $\mathcal{Q}$ and independent of $G$. We conclude that $V$ is uniformly distributed and
independent of $(W, \Lambda, V_1)$. Hence Property 2 holds.

By Definition 1, $G$ is identically distributed as $G q^\dagger$ for every deterministic unitary matrix $q$.
Since the random matrix $Q$ is independent of $G$, $G$ is also identically distributed
as $G Q^\dagger = W \Lambda V_1^\dagger Q^\dagger = W \Lambda V^\dagger$. Thus Property 3 holds, i.e., $G \sim W \Lambda V^\dagger$. ∎

The following is a direct consequence of Lemma 1.

Corollary 1: Let $(G, W, \Lambda, V)$ be defined as in Lemma 1. Define block-diagonal matrices
$\underline{G} = \mathrm{diag}(G, \dots, G)$, $\underline{W} = \mathrm{diag}(W, \dots, W)$, $\underline{\Lambda} = \mathrm{diag}(\Lambda, \dots, \Lambda)$ and $\underline{V} = \mathrm{diag}(V, \dots, V)$,
each with $T$ diagonal blocks. Then $\underline{G} \sim \underline{W}\,\underline{\Lambda}\,\underline{V}^\dagger$.

We remark that in general $V_1$ is not independent of $(W, \Lambda)$. By scrambling $V_1$ using a
uniformly distributed $Q$, we obtain $V$, which is guaranteed to be uniformly distributed and
independent of $(W, \Lambda)$ by Lemma 1.

From Lemma 1 we can obtain matrices $(W_{rt}, \Lambda_{rt}, V_{rt})$ from the compact SVD of $H_{rt}$,
which satisfy the three properties given in the lemma. In particular, $V_{rt}$ is uniformly distributed
and independent of $H_{rt}$. For every $r, t = 1, 2$, the channel matrix $H_{rt}$ is identically distributed as
$W_{rt} \Lambda_{rt} V_{rt}^\dagger$, although they are not equal in general. Since the channel capacity depends only
on the statistics of the channel state, we can substitute $H_{rt}$ by $W_{rt} \Lambda_{rt} V_{rt}^\dagger$ in model (1) for
$t, r = 1, 2$ without changing the capacity region. This substitution allows a simple proof of
the converse part of Theorem 1. Therefore, with slight abuse of notation, we let the channel
matrices be $H_{rt} = W_{rt} \Lambda_{rt} V_{rt}^\dagger$ from this point onward. Moreover, we let the decomposition
$(W_{rt}, \Lambda_{rt}, V_{rt})$ be determined by $H_{rt}$.



B. Preliminary Results 

We first develop several preliminary results to facilitate the proof. The following theorem,
proved in Appendix A, is a simple generalization of [19, Theorem 3] to vector channels.
Theorem 2 (Gaussian input is not too bad): Suppose that $w$ and $\tilde{w}$ are two random $M$-vectors,
$H$ ($N \times M$) is a full-rank deterministic matrix, and $v$ is a random $N$-vector which is independent
of $w$ and $\tilde{w}$. We assume that $\mathsf{E}\|w\|^2 \le \gamma$. Then

$$I(Hw + v; w) \le I(H\tilde{w} + v; \tilde{w}) + \sup_{\mathsf{E}\|a\|^2 \le \gamma} I(Ha + H\tilde{w}; a). \tag{7}$$

In particular, if $\tilde{w}$ has distribution $\mathcal{CN}(0, \frac{\gamma}{M} I)$, then

$$I(Hw + v; w) \le I(H\tilde{w} + v; \tilde{w}) + C^* \tag{8}$$

where

$$C^* = \min(M, N) \log\left(1 + \frac{M}{\min(M, N)}\right). \tag{9}$$

Furthermore, for channel model (1) and regarding $H_{12}[i]x[i] + u_1[i] = v[i]$, we have

$$I(y^n; w^n | H^n) \le I(\tilde{y}^n; \tilde{w}^n | H^n) + nC^* \tag{10}$$

where

$$\tilde{y}[i] = H_{11}[i]\,\tilde{w}[i] + H_{12}[i]\,x[i] + u_1[i] \tag{11}$$

for $i = 1, \dots, n$ and $\tilde{w}[i] \sim \mathcal{CN}(0, \frac{\gamma}{M_1} I)$ are i.i.d. over time ($i = 1, 2, \dots$).

The following lemma, shown in Appendix B, puts an upper bound on the change of mutual
information due to a change of the amplitudes.

Lemma 2: Let $\Lambda_1$ and $\Lambda_2$ be two $M \times M$ diagonal random matrices with strictly positive
diagonal elements almost surely. Let $x$ denote a random vector and $u$ a CSCG random vector
with arbitrary covariance, both of dimension $M$. Assume that $x$, $u$ and $(\Lambda_1, \Lambda_2)$ are independent.
Define the random matrix $\Lambda_{\min} = \min(\Lambda_1, \Lambda_2)$ as the element-wise minimum. Then

$$I(\Lambda_2 x + u; x | \Lambda_2) - I(\Lambda_1 x + u; x | \Lambda_1) \le 2\,\mathsf{E}\log\frac{\det \Lambda_2}{\det \Lambda_{\min}} \le 2\,\mathsf{E}\log^+ \det \Lambda_2 + 2\,\mathsf{E}\left[\log^+ \frac{1}{\det \Lambda_{\min}}\right] \tag{12}$$

where $\log^+(x) = \log\max(1, x)$. Evidently, if $\Lambda_1$ and $\Lambda_2$ are deterministic, the inequalities hold
with all expectations and conditionings dropped.

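Lemma 3: Let $x$ be a random vector in $\mathbb{C}^M$, $u_j \sim \mathcal{CN}(0, I_{K_j})$, $j = 1, 2, 3$, and $K_1 \le K_2 \le M$.
In addition, let $V_j$ be a random $M \times K_j$ matrix for $j = 1, 2, 3$. Suppose that, conditioned on
any realization of $V_3$, $V_j$ is uniformly distributed on $\mathcal{V}_j = \{V \in \mathbb{C}^{M \times K_j} \,|\, V^\dagger V = I_{K_j} \text{ and } V^\dagger V_3 = 0\}$ for
$j = 1, 2$. Suppose also that $x$, $u_1$, $u_2$, $u_3$ and $V = (V_1, V_2, V_3)$ are mutually independent.
Then

$$\frac{1}{K_1} I\left(V_1^\dagger x + u_1; x \,\middle|\, V_3^\dagger x + u_3, V\right) \ge \frac{1}{K_2} I\left(V_2^\dagger x + u_2; x \,\middle|\, V_3^\dagger x + u_3, V\right). \tag{13}$$

Furthermore, suppose $(V_1[i], V_2[i], V_3[i])_{i=1}^n$ is i.i.d. following the joint distribution of $(V_1, V_2, V_3)$,
then

$$\frac{1}{K_1} I\left(\{V_1^\dagger x + u_1\}^n; x^n \,\middle|\, \{V_3^\dagger x + u_3\}^n, V^n\right) \ge \frac{1}{K_2} I\left(\{V_2^\dagger x + u_2\}^n; x^n \,\middle|\, \{V_3^\dagger x + u_3\}^n, V^n\right). \tag{14}$$

In particular, if $V_3 = 0$, (13) and (14) become

$$\frac{1}{K_1} I\left(V_1^\dagger x + u_1; x \,\middle|\, V_1\right) \ge \frac{1}{K_2} I\left(V_2^\dagger x + u_2; x \,\middle|\, V_2\right)$$

and

$$\frac{1}{K_1} I\left(\{V_1^\dagger x + u_1\}^n; x^n \,\middle|\, V_1^n\right) \ge \frac{1}{K_2} I\left(\{V_2^\dagger x + u_2\}^n; x^n \,\middle|\, V_2^n\right)$$

respectively.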

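Proved in Appendix C, Lemma 3 essentially states that the mutual information per dimension
decreases with the dimensionality of the uniform transformation of the channel input. The
following corollary is a simple extension of Lemma 3 to block-diagonal matrices.

Corollary 2: Suppose that $\underline{V}_1 = \mathrm{diag}(V_1, \dots, V_1)$, $\underline{V}_2 = \mathrm{diag}(V_2, \dots, V_2)$, and $\underline{V}_3 =
\mathrm{diag}(V_3, \dots, V_3)$ are three random block-diagonal matrices with the same number of diagonal
blocks, where the random matrices $V_1$, $V_2$, $V_3$ satisfy the same conditions as in Lemma 3.
Suppose that $x$ is an independent random vector and $u_1$, $u_2$, and $u_3$ are three white CSCG
vectors with unit covariance matrices and compatible sizes. Then

$$\frac{1}{K_1} I\left(\underline{V}_1^\dagger x + u_1; x \,\middle|\, \underline{V}_3^\dagger x + u_3, \underline{V}\right) \ge \frac{1}{K_2} I\left(\underline{V}_2^\dagger x + u_2; x \,\middle|\, \underline{V}_3^\dagger x + u_3, \underline{V}\right).$$

Furthermore, suppose $(\underline{V}_1[i], \underline{V}_2[i], \underline{V}_3[i])_{i=1}^n$ is i.i.d. following the joint distribution of $(\underline{V}_1, \underline{V}_2, \underline{V}_3)$,
then

$$\frac{1}{K_1} I\left(\{\underline{V}_1^\dagger x + u_1\}^n; x^n \,\middle|\, \{\underline{V}_3^\dagger x + u_3\}^n, \underline{V}^n\right) \ge \frac{1}{K_2} I\left(\{\underline{V}_2^\dagger x + u_2\}^n; x^n \,\middle|\, \{\underline{V}_3^\dagger x + u_3\}^n, \underline{V}^n\right).$$

The following result is proved in Appendix D.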
Lemma 4: Consider the following two channels with $M$-vector input $x$ and fading matrices $A$
and $B$:

$$y = Ax + n_1 \tag{15a}$$
$$z = Bx + n_2 \tag{15b}$$

where $n_1 \sim \mathcal{CN}(0, \Sigma_1)$ and $n_2 \sim \mathcal{CN}(0, \Sigma_2)$ are mutually independent CSCG noise vectors, and the matrix
$\begin{bmatrix} A \\ B \end{bmatrix}$ is isotropic. We also assume that $\mathsf{E}\|x\|^2 \le \gamma$. Let $y^G$ and $z^G$ be the corresponding outputs
of model (15) with input $x^G \sim \mathcal{CN}(0, \frac{\gamma}{M} I_M)$, respectively. Then

$$I(y; x | z, A, B) \le I(y^G; x^G | z^G, A, B) \tag{16}$$
$$= \mathsf{E}\log\det\left(\begin{bmatrix} \Sigma_1 & 0 \\ 0 & \Sigma_2 \end{bmatrix} + \frac{\gamma}{M}\begin{bmatrix} A \\ B \end{bmatrix}\begin{bmatrix} A \\ B \end{bmatrix}^\dagger\right) - \mathsf{E}\log\left(\det\left(\Sigma_2 + \frac{\gamma}{M} B B^\dagger\right)\det \Sigma_1\right). \tag{17}$$

Furthermore, if conditioned on $x^n$, $(y[i], z[i], A[i], B[i])_{i=1}^n$ are i.i.d. following the joint
distribution of $(y, z, A, B)$ conditioned on $x$, then

$$I(y^n; x^n | z^n, A^n, B^n) \le n\, I(y^G; x^G | z^G, A, B). \tag{18}$$

C. Proof of the Converse of Theorem 1 with $T = 1$

We prove the converse part of Theorem 1 in the case of $T = 1$ in this subsection. The case
of general $T$ will be proved in Section IV-D. Recall that in the channel model described in
Section II, each receiver knows only the CSI of its own incoming links. As far as the converse
proof is concerned, we assume both receivers are provided the CSI of all links, which can only
enlarge the capacity region.

The outer bounds (2a) are trivial single-user bounds. We establish (2b) next.
At receiver 1, by Fano's inequality and Theorem 2 we have

$$nR_1 - \delta_n \le I(y^n; w^n | H^n) \tag{19}$$
$$\le I(\tilde{y}^n; \tilde{w}^n | H^n) + nC^* \tag{20}$$
where $\tilde{w}[1], \dots, \tilde{w}[n]$ denote i.i.d. white CSCG inputs, $\tilde{y}$ is given by (11) and $C^*$ is given in (9).
By two different uses of the chain rule on $I(\tilde{y}^n; x^n, \tilde{w}^n | H^n)$, we have

$$I(\tilde{y}^n; \tilde{w}^n | H^n) = I(\tilde{y}^n; x^n | H^n) + I(\tilde{y}^n; \tilde{w}^n | x^n, H^n) - I(\tilde{y}^n; x^n | \tilde{w}^n, H^n) \tag{21}$$

where two of the terms can be further simplified:

$$I(\tilde{y}^n; \tilde{w}^n | x^n, H^n) = I(\{H_{11}\tilde{w} + u_1\}^n; \tilde{w}^n | H^n) \tag{22}$$
$$= n\,\mathsf{E}\log\det\left(I + \frac{\gamma}{M_1} H_{11} H_{11}^\dagger\right) \tag{23}$$

and

$$I(\tilde{y}^n; x^n | \tilde{w}^n, H^n) = I(\{H_{12}x + u_1\}^n; x^n | H^n). \tag{24}$$
For every $r, t = 1, 2$, we have the compact SVD $H_{rt} = W_{rt} \Lambda_{rt} V_{rt}^\dagger$ as described in Section IV-A,
where $W_{rt}$ and $V_{rt}$ consist of orthonormal columns. We can write

$$I(\{H_{12}x + u_1\}^n; x^n | H^n) = I(\{W_{12}\Lambda_{12}V_{12}^\dagger x + u_1\}^n; x^n | H^n)$$
$$= I(\{\Lambda_{12}V_{12}^\dagger x + v_1\}^n; x^n | H^n) \tag{25}$$
$$\ge I(\{V_{12}^\dagger x + v_1\}^n; x^n | H^n) - n\Delta_1 \tag{26}$$

by Lemma 2, where $v_1 = W_{12}^\dagger u_1 \sim \mathcal{CN}(0, I_{\min(M_2, N_1)})$,

$$\Delta_1 = 2\,\mathsf{E}\left[\log^+ \frac{1}{\det(\min(I, \Lambda_{12}))}\right]$$

and (25) is due to the fact that given $H_{12}$, $\Lambda_{12}V_{12}^\dagger x + v_1$ is a sufficient statistic of $H_{12}x + u_1$
for $x$ (see, e.g., [13, Appendix A]). Collecting the preceding bounds, we have an upper bound
on the rate of user 1:

$$nR_1 - \delta_n - nC^* \le I(\tilde{y}^n; x^n | H^n) + n\,\mathsf{E}\log\det\left(I + \frac{\gamma}{M_1} H_{11} H_{11}^\dagger\right) - I(\{V_{12}^\dagger x + v_1\}^n; x^n | H^n) + n\Delta_1. \tag{27}$$
An upper bound on the rate of user 2 is obtained by Fano's inequality and the fact that
$x$ — $H_{22}x + u_2$ — $z$ is Markovian:

$$nR_2 - \delta_n \le I(z^n; x^n | H^n)$$
$$\le I(\{H_{22}x + u_2\}^n; x^n | H^n)$$
$$\le I(\{V_{22}^\dagger x + v_2\}^n; x^n | H^n) + n\Delta_2 \tag{28}$$

where (28) is by Lemma 2 with

$$\Delta_2 = 2\,\mathsf{E}\log^+ \det \Lambda_{22} + 2\,\mathsf{E}\left[\log^+ \frac{1}{\det(\min(I, \Lambda_{22}))}\right]$$

and $v_2 = W_{22}^\dagger u_2 \sim \mathcal{CN}(0, I_{\min(M_2, N_2)})$.

The remaining discussion is on the two bounds (27) and (28). In view of the three cases
introduced in the achievability proof of Theorem 1, namely Cases (a) $M_2 \le N_1$, (b) $M_2 > N_1$ and
$M_1 \ge N_1$, and (c) $M_2 > N_1 > M_1$, we divide the remaining proof of the converse into two parts:
the first part investigates Cases (a) and (b) together, and the second part investigates Case (c).



1) Proof of Cases (a) and (b): In both cases, the outer bound (2b) can be written as

$$d_1 + \frac{\min(M_2, N_1)}{\min(M_2, N_2)}\, d_2 \le \min(M_1 + M_2, N_1). \tag{29}$$

We give a proof of (29) which is similar to but much simpler than that in [12].

The mutual information $I(\tilde{y}^n; x^n | H^n)$ is that of an isotropic fading channel with no CSIT,
which is maximized by i.i.d. Gaussian inputs:

$$I(\tilde{y}^n; x^n | H^n) \le n\,\mathsf{E}\log\frac{\det\left(I + \frac{\gamma}{M_1} H_{11}H_{11}^\dagger + \frac{\gamma}{M_2} H_{12}H_{12}^\dagger\right)}{\det\left(I + \frac{\gamma}{M_1} H_{11}H_{11}^\dagger\right)}. \tag{30}$$

Therefore, by (27),

$$nR_1 - \delta_n - nC^* - n\Delta_1 \le n\,\mathsf{E}\log\det\left(I + \frac{\gamma}{M_1} H_{11}H_{11}^\dagger + \frac{\gamma}{M_2} H_{12}H_{12}^\dagger\right) - I(\{V_{12}^\dagger x + v_1\}^n; x^n | H^n). \tag{31}$$

The remaining task is to determine the ratio between the two remaining mutual information
terms in (31) and (28). By noting that $V_{22}$ is of size $M_2 \times \min(M_2, N_2)$ and $V_{12}$ is of size $M_2 \times
\min(M_2, N_1)$, and applying Lemma 3, we have

$$I(\{V_{12}^\dagger x + v_1\}^n; x^n | H^n) \ge \frac{\min(M_2, N_1)}{\min(M_2, N_2)}\, I(\{V_{22}^\dagger x + v_2\}^n; x^n | H^n). \tag{32}$$
Comparing (28), (31) and (32) and sending $n \to \infty$, we establish

$$R_1 + \frac{\min(M_2, N_1)}{\min(M_2, N_2)} R_2 - \Delta \le \mathsf{E}\log\det\left(I + \frac{\gamma}{M_1} H_{11}H_{11}^\dagger + \frac{\gamma}{M_2} H_{12}H_{12}^\dagger\right) \tag{33}$$

where

$$\Delta = C^* + \Delta_1 + \frac{\min(M_2, N_1)}{\min(M_2, N_2)} \Delta_2. \tag{34}$$

The right-hand side of (33) is the sum ergodic capacity of the MAC formed by the two transmitters
and receiver 1. In the high-SNR regime ($\gamma \to \infty$), we have

$$\mathsf{E}\log\det\left(I + \frac{\gamma}{M_1} H_{11}H_{11}^\dagger + \frac{\gamma}{M_2} H_{12}H_{12}^\dagger\right) = \min(M_1 + M_2, N_1)\log\gamma + o(\log\gamma).$$

Hence (29) is established.



2) Proof of Case (c): In Case (c), $M_2 > N_1 > M_1$, (2b) becomes

$$d_1 + \mu(d_2 - L) \le M_1 \tag{35}$$

where $L = N_1 - M_1$ and

$$\mu = \frac{M_1}{\min(M_2, N_2) - L}. \tag{36}$$

To establish (35), we shall use some alignment techniques developed in [20]. We first note
that the capacity region of an interference channel depends only on the marginal distributions
of the two received signals $y$ and $z$ conditioned on the inputs, and is otherwise invariant to the
joint distribution of the outputs. Without changing the marginals of the outputs, we assume the
following alignment in the channels and noise processes between the two users: Let $V_{12}$ ($M_2 \times
N_1$) consist of the last $N_1$ columns of $V_{22}$ ($M_2 \times \min(M_2, N_2)$). Let also $v_1 = W_{12}^\dagger u_1$ consist
of the last $N_1$ elements of $v_2 = W_{22}^\dagger u_2$ (both are i.i.d. Gaussian noise). It is important to note
that $W_{12}$ is $N_1 \times N_1$ and unitary in this case.

Let

$$\bar{y} = V_{12}^\dagger x + W_{12}^\dagger H_{11}\tilde{w} + v_1. \tag{37}$$
We can upper bound $I(\tilde{y}^n; x^n | H^n)$ in (27) as follows:⁵

$$I(\tilde{y}^n; x^n | H^n) = I(\{W_{12}^\dagger \tilde{y}\}^n; x^n | H^n) \tag{38}$$
$$= I(\{\Lambda_{12}V_{12}^\dagger x + W_{12}^\dagger H_{11}\tilde{w} + v_1\}^n; x^n | H^n) \tag{39}$$
$$\le I(\bar{y}^n; x^n | H^n) + n\Delta_3 \tag{40}$$

where (40) is due to Lemma 2 and

$$\Delta_3 = 2\,\mathsf{E}\log^+ \det \Lambda_{12} + 2\,\mathsf{E}\left[\log^+ \frac{1}{\det(\min(I, \Lambda_{12}))}\right].$$

Substituting (40) into (27) and noting that $x$ — $V_{12}^\dagger x + v_1$ — $\bar{y}$ is Markovian, we can upper bound
the rate of user 1 further:

$$nR_1 - \delta_n - nC^* - n\Delta_1 - n\Delta_3 \le n\,\mathsf{E}\log\det\left(I + \frac{\gamma}{M_1} H_{11}H_{11}^\dagger\right) - I(\{V_{12}^\dagger x + v_1\}^n; x^n | H^n) + I(\bar{y}^n; x^n | H^n)$$
$$= n\,\mathsf{E}\log\det\left(I + \frac{\gamma}{M_1} H_{11}H_{11}^\dagger\right) - I(\{V_{12}^\dagger x + v_1\}^n; x^n | \bar{y}^n, H^n). \tag{41}$$

We can upper bound the rate of user 2 further by providing $\bar{y}$ as side information in (28):

$$nR_2 - \delta_n - n\Delta_2 \le I(\{V_{22}^\dagger x + v_2\}^n, \bar{y}^n; x^n | H^n)$$
$$= I(\bar{y}^n; x^n | H^n) + I(\{V_{22}^\dagger x + v_2\}^n; x^n | \bar{y}^n, H^n) \tag{42}$$

where (42) is due to the chain rule.

In order to establish (35), we need to identify the ratio between the last mutual information
terms in (41) and (42), namely, $I(\{V_{12}^\dagger x + v_1\}^n; x^n | \bar{y}^n, H^n)$ and $I(\{V_{22}^\dagger x + v_2\}^n; x^n | \bar{y}^n, H^n)$.
They can roughly be interpreted as the rate loss of user 1 due to interference and the rate gain
of user 2 by causing interference to user 1, respectively.

Suppose that we have the following result (to be proved shortly):

Lemma 5: Let $\mu$ be given by (36). As $\gamma \to \infty$,

$$\mu\, I(\{V_{22}^\dagger x + v_2\}^n; x^n | \bar{y}^n, H^n) - I(\{V_{12}^\dagger x + v_1\}^n; x^n | \bar{y}^n, H^n) \le n \times o(\log\gamma) \tag{43}$$

where the variables are as defined in this section.

5 This hinges on the crucial fact that $W_{12}$ is invertible in Case (c). Because the interference plus noise, $H_{11}\tilde{w} + u_1$, is not
white, the equality (39) does not hold in general if $W_{12}$ is column-rank-deficient.
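Comparing (43) with (41) and (42) and sending $n \to \infty$, we have

$$R_1 + \mu R_2 - (1 + \mu)\frac{\delta_n}{n} - \Delta - o(\log\gamma) \le \mathsf{E}\log\det\left(I + \frac{\gamma}{M_1} H_{11}H_{11}^\dagger\right) + \frac{\mu}{n} I(\bar{y}^n; x^n | H^n) \tag{44}$$
$$\le \mathsf{E}\log\det\left(I + \frac{\gamma}{M_1} H_{11}H_{11}^\dagger\right) + \mu\,\mathsf{E}\log\frac{\det\left(I + \frac{\gamma}{M_2} W_{12}W_{12}^\dagger + \frac{\gamma}{M_1} H_{11}H_{11}^\dagger\right)}{\det\left(I + \frac{\gamma}{M_1} H_{11}H_{11}^\dagger\right)} \tag{45}$$

where $\Delta = C^* + \Delta_1 + \Delta_3 + \mu\Delta_2$ and (45) is due to the fact that the mutual information
$I(\bar{y}^n; x^n | H^n)$ is maximized by i.i.d. CSCG inputs. Consider the approximation in the
high-SNR regime [13]:

$$\mathsf{E}\log\det\left(I + \frac{\gamma}{M_1} H_{11}H_{11}^\dagger\right) = \min(M_1, N_1)\log\gamma + o(\log\gamma)$$
$$\mathsf{E}\log\det\left(I + \frac{\gamma}{M_2} W_{12}W_{12}^\dagger + \frac{\gamma}{M_1} H_{11}H_{11}^\dagger\right) = \min(M_1 + M_2, N_1)\log\gamma + o(\log\gamma).$$

Dividing both sides of (45) by $\log(1 + \gamma)$ and letting $\gamma \to \infty$, we obtain

$$d_1 + \mu d_2 \le \min(M_1, N_1) + \mu\left[\min(M_1 + M_2, N_1) - \min(M_1, N_1)\right]$$

which reduces to (35) under the assumption of $M_2 > N_1 > M_1$.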



The remaining task is to verify that (43) holds.

Proof of Lemma 5: By noting that $x$ — $V_{22}^\dagger x + v_2$ — $V_{12}^\dagger x + v_1$ — $\bar{y}$ is a Markov chain (due
to the alignment), we have

$$\mu\, I(\{V_{22}^\dagger x + v_2\}^n; x^n | \bar{y}^n, H^n) - I(\{V_{12}^\dagger x + v_1\}^n; x^n | \bar{y}^n, H^n)$$
$$= \mu\, I(\{V_{22}^\dagger x + v_2\}^n; x^n | H^n) - I(\{V_{12}^\dagger x + v_1\}^n; x^n | H^n) + (1 - \mu)\, I(\bar{y}^n; x^n | H^n). \tag{46}$$

Intuitively, the interference in signal $\bar{y}$ caused by $H_{11}\tilde{w}$ is much stronger than the noise in the high-SNR
regime. However, since $N_1 > M_1$, the interference $H_{11}\tilde{w}$ only occupies an $M_1$-dimensional
subspace. We want to show that this subspace, which contributes no DoF, can be isolated from
the $N_1$-dimensional received signal space so that the remaining $(N_1 - M_1)$-dimensional subspace
can be used by user 2 without interference.

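Conditioned on $H$, the term $H_{11}\tilde{w} \sim \mathcal{CN}(0, \frac{\gamma}{M_1} H_{11}H_{11}^\dagger)$ in (37) is a Gaussian random vector.
Consider the compact SVD $H_{11} = W_{11}\Lambda_{11}V_{11}^\dagger$, where $\Lambda_{11}$ is an $M_1 \times M_1$ diagonal matrix
whose diagonal elements are strictly positive with probability 1. We can append orthonormal
columns to $W_{11}$ to form a unitary matrix $W = [W_{11}, W_{11}^{\perp}]$. Evidently, the term $H_{11}\tilde{w}$ in (37)
can be rewritten as

$$H_{11}\tilde{w} = W \begin{bmatrix} \Lambda_{11} \\ 0 \end{bmatrix} V_{11}^\dagger \tilde{w}. \tag{47}$$

Let us define

$$V_{12}^{\prime\dagger} = W^\dagger W_{12} V_{12}^\dagger \tag{48}$$
$$v_1' = W^\dagger W_{12} v_1 \tag{49}$$

where $v_1' \sim \mathcal{CN}(0, I)$ is independent of $(W, W_{12})$. Furthermore, the $N_1 \times M_2$ matrix $V_{12}^{\prime\dagger}$ can
be expressed in terms of its sub-matrices as $V_{12}^{\prime\dagger} = \begin{bmatrix} V_{12,L}^{\prime\dagger} \\ V_{12,R}^{\prime\dagger} \end{bmatrix}$, where $V_{12,L}'$ consists of the
first $M_1$ columns of $V_{12}'$ and $V_{12,R}'$ consists of the remaining $N_1 - M_1$ columns. Also, let $v_{1,u}'$ consist
of the first $M_1$ elements in $v_1'$ and $v_{1,d}'$ consist of the remaining $N_1 - M_1$ elements. We have

$$I(\bar{y}^n; x^n | H^n) = I(\{W^\dagger W_{12}\bar{y}\}^n; x^n | H^n) \tag{50}$$
$$= I\left(\left\{V_{12}^{\prime\dagger} x + v_1' + \begin{bmatrix} \Lambda_{11}V_{11}^\dagger \tilde{w} \\ 0 \end{bmatrix}\right\}^n; x^n \,\middle|\, H^n\right) \tag{51}$$
$$= I\left(\{V_{12,L}^{\prime\dagger} x + v_{1,u}' + \Lambda_{11}V_{11}^\dagger \tilde{w}\}^n, \{V_{12,R}^{\prime\dagger} x + v_{1,d}'\}^n; x^n \,\middle|\, H^n\right)$$
$$= I\left(\{V_{12,R}^{\prime\dagger} x + v_{1,d}'\}^n; x^n \,\middle|\, H^n\right) + I\left(\{V_{12,L}^{\prime\dagger} x + v_{1,u}' + \Lambda_{11}V_{11}^\dagger \tilde{w}\}^n; x^n \,\middle|\, \{V_{12,R}^{\prime\dagger} x + v_{1,d}'\}^n, H^n\right) \tag{52}$$

where (52) is due to the chain rule. We next invoke Lemma 4 on the conditional mutual
information in (52) with $A = V_{12,L}^{\prime\dagger}$, $B = V_{12,R}^{\prime\dagger}$, and the noise covariance matrices

$$\Sigma_1 = \mathrm{cov}\{v_{1,u}' + \Lambda_{11}V_{11}^\dagger \tilde{w}\} = I + \frac{\gamma}{M_1}\Lambda_{11}^2$$

and $\Sigma_2 = I$. As a result, (52) is upper bounded:

$$I(\bar{y}^n; x^n | H^n) \le I\left(\{V_{12,R}^{\prime\dagger} x + v_{1,d}'\}^n; x^n \,\middle|\, H^n\right) - n\,\mathsf{E}\log\left(\det\left(\frac{\gamma}{M_2} I + I\right)\det\left(I + \frac{\gamma}{M_1}\Lambda_{11}^2\right)\right) + n\,\mathsf{E}\log\left(\det\left(\frac{\gamma}{M_1}\Lambda_{11}^2 + I + \frac{\gamma}{M_2} I\right)\det\left(\frac{\gamma}{M_2} I + I\right)\right) \tag{53}$$
$$= I\left(\{V_{12,R}^{\prime\dagger} x + v_{1,d}'\}^n; x^n \,\middle|\, H^n\right) + n\,\mathsf{E}\log\det\left(I + \frac{M_1}{M_2}\left(\Lambda_{11}^2 + \frac{M_1}{\gamma} I\right)^{-1}\right)$$
$$\le I\left(\{V_{12,R}^{\prime\dagger} x + v_{1,d}'\}^n; x^n \,\middle|\, H^n\right) + n \times o(\log\gamma). \tag{54}$$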
Let us also define $V_{12} = [V_{12,L}, V_{12,R}]$ where $V_{12,L}$ consists of the first $M_1$ columns. Then
$V_{12,R}'$ and $V_{12,R}$ are identically distributed. The upper bound (54) can thus be rewritten as

$$I(\bar{y}^n; x^n | H^n) \le I\left(\{V_{12,R}^\dagger x + v_{1,d}\}^n; x^n \,\middle|\, H^n\right) + n \times o(\log\gamma) \tag{55}$$

where $v_{1,d}$ consists of the last $N_1 - M_1$ elements of $v_1$ and is identically distributed as $v_{1,d}'$.

Substituting (55) into (46), it suffices to show the following inequality in order to establish (43):

$$\mu\, I(\{V_{22}^\dagger x + v_2\}^n; x^n | H^n) - I(\{V_{12}^\dagger x + v_1\}^n; x^n | H^n) + (1 - \mu)\, I(\{V_{12,R}^\dagger x + v_{1,d}\}^n; x^n | H^n) \le 0. \tag{56}$$

Recall that $V_{12}$ ($M_2 \times N_1$) consists of the last $N_1$ columns of $V_{22}$ ($M_2 \times \min(M_2, N_2)$) due to the
assumed alignment. Hence $V_{22}$ contains all the $N_1 - M_1$ columns of $V_{12,R}$ and we can write
$V_{22} = [V_{22,L}, V_{12,R}]$, where $V_{22,L}$ consists of the first $p = \min(M_2, N_2) - (N_1 - M_1)$ columns
of $V_{22}$.

Furthermore, denote the first $p$ elements in $v_2$ as $v_{2,u}$. The remaining part of $v_2$ is $v_{1,d}$ due to the
alignment assumption. Therefore, the left-hand side of (56) is equal to

$$\mu\, I\left(\{V_{22}^\dagger x + v_2\}^n; x^n \,\middle|\, \{V_{12,R}^\dagger x + v_{1,d}\}^n, H^n\right) - I\left(\{V_{12}^\dagger x + v_1\}^n; x^n \,\middle|\, \{V_{12,R}^\dagger x + v_{1,d}\}^n, H^n\right)$$
$$= \mu\, I\left(\{V_{22,L}^\dagger x + v_{2,u}\}^n; x^n \,\middle|\, \{V_{12,R}^\dagger x + v_{1,d}\}^n, H^n\right) - I\left(\{V_{12,L}^\dagger x + v_{1,u}\}^n; x^n \,\middle|\, \{V_{12,R}^\dagger x + v_{1,d}\}^n, H^n\right). \tag{57}$$

Note that $\mu = M_1/p$ and $(V_{12,L}, V_{22,L}, V_{12,R})$ satisfy the conditions of $(V_1, V_2, V_3)$ in
Lemma 3. That is, conditioned on $V_{12,R}$, the matrices $V_{12,L}$ and $V_{22,L}$ are uniformly distributed
in the respective subspaces orthogonal to $V_{12,R}$. Therefore, (56) follows by applying Lemma 3
to (57). Thus (43) is established and so is Theorem 1. ∎



D. Proof of the Converse of Theorem 1 with general T

The proof of the general case with coherence time $T$ is similar to that of the special i.i.d. case ($T = 1$). Without loss of generality, we consider the time period from 1 to $nT$. By stacking the transmitted signals and noise terms at time slots $i = (j-1)T + 1, \ldots, jT$ into longer vectors $w[j]$, $x[j]$, $u_1[j]$, and $u_2[j]$, respectively, for $j = 1, \ldots, n$, the model (1) with coherence time $T$ can be rewritten as

\begin{align}
y[j] &= H_{11}[j]\, w[j] + H_{12}[j]\, x[j] + u_1[j] \tag{58a} \\
z[j] &= H_{21}[j]\, w[j] + H_{22}[j]\, x[j] + u_2[j] \tag{58b}
\end{align}

for $j = 1, \ldots, n$, where for every $(r, t, j)$, the stacked matrix $H_{rt}[j]$ is an independent block diagonal matrix with identical diagonal blocks, i.e., $H_{rt}[j] = \mathrm{diag}(H_{rt}[jT], \ldots, H_{rt}[jT])$, where the blocks on the right-hand side are the per-slot channel matrices.
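The stacking behind (58) can be checked numerically. The following is a minimal sketch, not part of the proof: it assumes a channel matrix that stays constant over the $T$ slots of one block and verifies that applying it slot by slot agrees with applying the block-diagonal matrix to the stacked vector. The helper name `stacked_channel` and the chosen dimensions are hypothetical.

```python
import numpy as np

def stacked_channel(H, T):
    """Block-diagonal matrix diag(H, ..., H) with T identical blocks."""
    return np.kron(np.eye(T), H)

rng = np.random.default_rng(0)
N, M, T = 2, 3, 4  # receive antennas, transmit antennas, coherence time
H = rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))
x_slots = rng.standard_normal((T, M)) + 1j * rng.standard_normal((T, M))

# Apply the same channel in each slot, then concatenate the outputs.
per_slot = np.concatenate([H @ x for x in x_slots])
# Apply the block-diagonal channel to the stacked input vector.
stacked = stacked_channel(H, T) @ x_slots.reshape(-1)
```

Both computations give the same length-$NT$ vector, which is why the proof for $T = 1$ carries over to the stacked model.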



Therefore, the general case can be shown by using the equivalent channel (58) and following exactly the same steps as in the proof for the case of $T = 1$, where the applications of Lemmas 1 and 3 should be replaced by the corresponding Corollaries 1 and 2. The DoF region turns out to be identical to that of the case of $T = 1$.

V. Concluding Remarks 



We have fully characterized the degree-of-freedom region of the two-user isotropic fading 
MIMO interference channels without channel state information at transmitters. In particular, we 
show that two users can use independent Gaussian single-user codebooks to achieve the entire 
DoF region. This suggests structured signaling schemes such as beamforming and interference 
alignment cannot provide additional gains in the high-SNR regime, although the exact capacity 
region remains open. 

Our result only applies to two-user interference channels with i.i.d. block fading, where the physical links have the same coherence time and aligned coherence blocks. Without CSI at transmitters, interference alignment might still provide additional gain beyond this particular channel model. For example, in [16], the author shows that for a channel with antenna configuration $(M_1, N_1, M_2, N_2) = (1, 2, 3, 4)$, as depicted in Fig. 1, if the coherence times of receiver 1's direct link and cross link are different (say, 1 and 2, respectively), the DoF pair (1, 1.5) can be achieved through interference alignment, while this DoF pair is excluded from the region developed in Theorem 1.

Appendix 

A. Proof of Theorem 2

The following result is shown in [19].

Lemma 6 ([19, Lemma 1]): Let $(u, v, w)$ be any real- or discrete-valued mutually independent random variables. Then

$$I(w + v; w) \le I(w + u; w) + I(u + v; u). \tag{59}$$
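As a sanity check of (59) in one special case (an illustration only, not the general proof in [19]), suppose $u$, $v$, $w$ are independent real zero-mean Gaussian variables, for which each mutual information has the closed form $\tfrac{1}{2}\log(1 + \text{signal variance}/\text{noise variance})$. The function names below are hypothetical.

```python
import math

def gaussian_mi(signal_var, noise_var):
    """I(s + n; s) in nats for independent real Gaussian s and n."""
    return 0.5 * math.log(1.0 + signal_var / noise_var)

def lemma6_gap(var_w, var_u, var_v):
    """Right-hand side minus left-hand side of (59) for Gaussian (u, v, w)."""
    lhs = gaussian_mi(var_w, var_v)
    rhs = gaussian_mi(var_w, var_u) + gaussian_mi(var_u, var_v)
    return rhs - lhs
```

In this Gaussian case the gap is nonnegative because $(1 + \sigma_w^2/\sigma_u^2)(1 + \sigma_u^2/\sigma_v^2) \ge 1 + \sigma_w^2/\sigma_v^2$.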



Following a similar procedure as in [19], we can show

$$I(Hw + v; Hw) \le I(Hw + H\tilde{w}; Hw) + I(H\tilde{w} + v; H\tilde{w}) \tag{60}$$

where $(w, v, \tilde{w})$ are mutually independent complex-valued random vectors and $H$ is a deterministic matrix. Moreover, $Hw$ is a sufficient statistic of $w$ for $Hw + v$ and $Hw + H\tilde{w}$; and $H\tilde{w}$ is a sufficient statistic of $\tilde{w}$ for $H\tilde{w} + v$. Hence (60) is equivalent to:

$$I(Hw + v; w) \le I(Hw + H\tilde{w}; w) + I(H\tilde{w} + v; \tilde{w}).$$

By noting that $E\|w\|^2 \le \gamma$, (7) is established.

In the case of $\tilde{w} \sim \mathcal{CN}(0, (\gamma/M) I)$, we need to show that

$$C = \sup_{E\|a\|^2 \le \gamma} I(Ha + H\tilde{w}; a) = C^*$$

where $C^*$ is given in Theorem 2. Consider the (full) SVD $H = W D V^\dagger$, where $D$ is an $N \times M$ nonnegative diagonal matrix, and $W$ and $V$ are $N \times N$ and $M \times M$ unitary matrices, respectively. We have

$$C = \sup_{E\|a'\|^2 \le \gamma} I(Da' + D\tilde{w}; a')$$

where $a' = V^\dagger a$. We observe that $a' \mapsto Da' + D\tilde{w}$ is exactly $\min(M, N)$ parallel Gaussian channels with the same gains. It is not difficult to see that $C = C^*$. Thus, (8) is established.

For channel (1), by stacking $w^n$ and $\{H_{12} x + u_1\}^n$ into two vectors of length $nM_1$ and $nN_1$, respectively, and applying (8) with channel matrix $\mathrm{diag}(H_{11}[1], \ldots, H_{11}[n])$, we obtain (10) if $H^n$ is constant. Averaging over the distribution of $H^n$ yields the general result (10).
B. Proof of Lemma 2

Since the two sides of (12) are expectations over the joint distribution of $(A_1, A_2)$, it suffices to show that for each realization of the matrices, also denoted by $(A_1, A_2)$,

$$I(A_1 x + u; x) - I(A_2 x + u; x) \ge -2\log^+ \det A_2 - 2\log^+\!\left(\frac{1}{\det A_{\min}}\right) \tag{62}$$



where $A_{\min} = \min(A_1, A_2) > 0$.

By the data processing inequality [21, Chapter 2],

\begin{align}
I(A_1 x + u; x) - I(A_2 x + u; x) &\ge I(A_{\min} x + u; x) - I(A_2 x + u; x) \notag \\
&= I(A_2 x + A_2 A_{\min}^{-1} u; x) - I(A_2 x + u; x). \tag{63}
\end{align}

Let $\Sigma_u$ be the covariance matrix of $u$ and $u'$ be an independent CSCG random vector with covariance $A_2 A_{\min}^{-1} \Sigma_u A_{\min}^{-1} A_2 - \Sigma_u$ (which is evidently positive semi-definite). Then (63) can be further written as

\begin{align}
I(A_1 x + u; x) - I(A_2 x + u; x) &\ge I(A_2 x + u + u'; x) - I(A_2 x + u; x) \notag \\
&= -I(A_2 x + u; x \mid A_2 x + u + u') \tag{64}
\end{align}

where (64) is because $x \to A_2 x + u \to A_2 x + u + u'$ is Markov. Therefore, it boils down to upper bounding the mutual information in (64):



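\begin{align}
I(A_2 x + u; x \mid A_2 x + u + u') &= I(u'; u + u' \mid A_2 x + u + u') \notag \\
&\le I(u'; u + u') \tag{65} \\
&= 2\log\frac{\det A_2}{\det A_{\min}} \notag \\
&\le 2\log^+ \det A_2 + 2\log^+\!\left(\frac{1}{\det A_{\min}}\right) \notag
\end{align}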
where in (65) we have used the fact that $u' \to u + u' \to A_2 x + u + u'$ forms a Markov chain. We have thus established (62). Lemma 2 follows by taking the expectation on both sides.
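The Gaussian term $I(u'; u + u')$ used in this proof can be evaluated numerically. The sketch below makes several assumptions that are not stated in the text: the matrices are real diagonal, "min" is taken entrywise, $\Sigma_u = I$, and $u$ and $u'$ are independent CSCG vectors, so that $I(u'; u + u') = \log\det(\Sigma_u + \Sigma_{u'}) - \log\det\Sigma_u$ in nats. The function names are hypothetical.

```python
import numpy as np

def logdet(m):
    """Log-determinant of a positive definite matrix."""
    sign, value = np.linalg.slogdet(m)
    return value

def extra_noise_information(a2, a_min, sigma_u):
    """I(u'; u + u') in nats for independent CSCG u and u' (assumed Gaussian)."""
    scale = a2 @ np.linalg.inv(a_min)
    sigma_extra = scale @ sigma_u @ scale.conj().T - sigma_u  # covariance of u'
    # The construction requires this covariance to be positive semi-definite.
    assert np.linalg.eigvalsh(sigma_extra).min() > -1e-12
    return logdet(sigma_u + sigma_extra) - logdet(sigma_u)

a1 = np.diag([0.5, 2.0, 1.5])
a2 = np.diag([1.0, 1.0, 3.0])
a_min = np.minimum(a1, a2)  # entrywise minimum (an assumption about "min")
sigma_u = np.eye(3)
closed_form = 2.0 * (logdet(a2) - logdet(a_min))
```

Under these assumptions the numerically evaluated mutual information agrees with $2\log(\det A_2/\det A_{\min})$.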



C. Proof of Lemma 3

Let a random vector $x$ and another random object $v$ have a joint distribution. Define the minimum mean-square error (MMSE) of estimating $x$ conditional on $v$ and $\sqrt{t}\, x + u$, where $u \sim \mathcal{CN}(0, I)$ is independent of $(x, v)$, as

$$\mathrm{mmse}(x; t \mid v) = E\left\| x - E\left[x \mid \sqrt{t}\, x + u, v\right] \right\|^2. \tag{66}$$

We have the following formula that relates the MMSE and mutual information [22]:

$$I\left(\sqrt{t}\, x + u; x \mid v\right) = \int_0^t \mathrm{mmse}(x; \tau \mid v)\, d\tau. \tag{67}$$

Find an arbitrary orthonormal basis in space $\mathbb{C}^{K_2}$, say, $\{e_i\}_{i=1}^{K_2}$; then construct $K_2$ subsets of $\{e_i\}_{i=1}^{K_2}$ such that each subset has $K_1$ elements and each $e_i$ is included in exactly $K_1$ subsets; each subset corresponds to a $K_1 \times K_2$ matrix, called $B_1, \ldots, B_{K_2}$. Then we see that $B_j B_j^\dagger = I_{K_1}$ for all $j = 1, \ldots, K_2$ and $\sum_{j=1}^{K_2} B_j^\dagger B_j = K_1 I_{K_2}$. Therefore, for any $v$ and $z$

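\begin{align}
\frac{1}{K_1} \sum_{j=1}^{K_2} \mathrm{mmse}(B_j z; t \mid v)
&= \frac{1}{K_1} \sum_{j=1}^{K_2} E\left\| B_j z - E\left[B_j z \mid \sqrt{t}\, B_j z + B_j u_2, v\right] \right\|^2 \notag \\
&\ge \frac{1}{K_1} \sum_{j=1}^{K_2} E\left\| B_j z - E\left[B_j z \mid \sqrt{t}\, z + u_2, v\right] \right\|^2 \tag{68} \\
&= \frac{1}{K_1} E\left[ \left(z - E\left[z \mid \sqrt{t}\, z + u_2, v\right]\right)^\dagger \Big( \sum_{j=1}^{K_2} B_j^\dagger B_j \Big) \left(z - E\left[z \mid \sqrt{t}\, z + u_2, v\right]\right) \right] \notag \\
&= E\left\| z - E\left[z \mid \sqrt{t}\, z + u_2, v\right] \right\|^2 \notag \\
&= \mathrm{mmse}(z; t \mid v) \tag{69}
\end{align}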



where (68) is due to the fact that we have better estimation with better observation. Letting $z = V_2^\dagger x$ and $v = (V_3^\dagger x + u_3, V)$ in (69), we have

$$\frac{1}{K_1} \sum_{j=1}^{K_2} \mathrm{mmse}\left(B_j V_2^\dagger x; t \mid V_3^\dagger x + u_3, V\right) \ge \mathrm{mmse}\left(V_2^\dagger x; t \mid V_3^\dagger x + u_3, V\right). \tag{70}$$
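One way to build selection matrices $B_1, \ldots, B_{K_2}$ with the two properties used in this proof is a cyclic construction, sketched below. This is only one possible choice (the proof allows any family of subsets with the stated covering property); it takes the standard basis as the orthonormal basis and assumes $K_1 \le K_2$. The function name is hypothetical.

```python
import numpy as np

def cyclic_selection_matrices(K1, K2):
    """B_j keeps basis vectors j, j+1, ..., j+K1-1 (indices taken mod K2)."""
    eye = np.eye(K2)
    return [eye[[(j + k) % K2 for k in range(K1)], :] for j in range(K2)]

K1, K2 = 2, 5
Bs = cyclic_selection_matrices(K1, K2)
# Each basis vector is selected by exactly K1 of the K2 matrices.
coverage = sum(B.T @ B for B in Bs)
```

Each $B_j$ has orthonormal rows, and `coverage` equals $K_1 I_{K_2}$, which is exactly what turns the sum over $j$ in (68) into the single MMSE in (69).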
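Furthermore, $P_{B_j V_2^\dagger \mid V_3}$ and $P_{V_1^\dagger \mid V_3}$ are uniform distributions on the subspace orthogonal to $V_3$ by assumption, hence $(B_j V_2^\dagger, V_3)$ and $(V_1^\dagger, V_3)$ are identically distributed. Therefore,

\begin{align}
\frac{K_2}{K_1}\, I\left(V_1^\dagger x + u_1; x \mid V_3^\dagger x + u_3, V\right)
&= \frac{1}{K_1} \sum_{j=1}^{K_2} I\left(B_j V_2^\dagger x + u_1; x \mid V_3^\dagger x + u_3, V\right) \notag \\
&= \frac{1}{K_1} \sum_{j=1}^{K_2} \int_0^1 \mathrm{mmse}\left(B_j V_2^\dagger x; t \mid V_3^\dagger x + u_3, V\right) dt \tag{71} \\
&\ge \int_0^1 \mathrm{mmse}\left(V_2^\dagger x; t \mid V_3^\dagger x + u_3, V\right) dt \tag{72} \\
&= I\left(V_2^\dagger x + u_2; x \mid V_3^\dagger x + u_3, V\right) \tag{73}
\end{align}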



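where (71) and (73) are due to (67), and (72) is due to (70). We have thus established (13).

To show (14), we stack $x[1], \ldots, x[n]$ into a vector $\underline{x}$ of size $nM$, stack $u_j[1], \ldots, u_j[n]$ into a vector $\underline{u}_j$ of size $nN_j$ for $j = 1, 2$, and construct the random matrix $\underline{V}_j = \mathrm{diag}(V_j[1], \ldots, V_j[n])$ for $j = 1, 2$. Then the sequence $\{V_j^\dagger[i]\, x[i] + u_j[i]\}_{i=1}^n$ can be represented as $\underline{V}_j^\dagger \underline{x} + \underline{u}_j$. Let $\underline{B}_j = \mathrm{diag}(B_j, \ldots, B_j)$. It is easy to see that $\underline{B}_j \underline{B}_j^\dagger = I_{nK_1}$ and $\frac{1}{K_1} \sum_{j=1}^{K_2} \underline{B}_j^\dagger \underline{B}_j = I_{nK_2}$. Although the $\underline{V}_j$ are not uniformly distributed, it is still true that $(\underline{B}_j \underline{V}_2^\dagger, \underline{V}_3)$ and $(\underline{V}_1^\dagger, \underline{V}_3)$ have identical distribution. Therefore, (14) follows by arguments similar to those above.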



D. Proof of Lemma [?] 

The equality ( fTVj ) is straightforward. We focus on the inequality ( fT6] ). 

Consider the eigenvalue decomposition of the noise covariance $\Sigma_1 = W_1 \Lambda_1 W_1^\dagger$; then $y' = \Lambda_1^{-1/2} W_1^\dagger y = A'x + n_1'$, where $n_1' = \Lambda_1^{-1/2} W_1^\dagger n_1 \sim \mathcal{CN}(0, I)$ and $A' = \Lambda_1^{-1/2} W_1^\dagger A$. Also, $y'$ is a sufficient statistic of $y$. Therefore, applying (67) with $v = (z, A', B)$, we have

\begin{align}
I(y; x \mid z, A, B) &= I(y'; x \mid z, A', B) \notag \\
&= \int_0^1 \mathrm{mmse}\left(A'x; t \mid z, A', B\right) dt. \tag{74}
\end{align}

Note that $A'$ is still isotropic by Definition 1.
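The noise-whitening step can be illustrated numerically. The sketch below assumes the whitening transform $\Lambda_1^{-1/2} W_1^\dagger$ built from the eigenvalue decomposition $\Sigma_1 = W_1 \Lambda_1 W_1^\dagger$ of a Hermitian positive definite covariance; the covariance used here is an arbitrary example, and the function name is hypothetical.

```python
import numpy as np

def whitening_matrix(sigma):
    """Return Lambda^{-1/2} W^H for a Hermitian positive definite sigma = W Lambda W^H."""
    eigvals, W = np.linalg.eigh(sigma)
    return np.diag(eigvals ** -0.5) @ W.conj().T

rng = np.random.default_rng(1)
G = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
sigma_1 = np.eye(3) + G @ G.conj().T  # example covariance, at least the identity
T_white = whitening_matrix(sigma_1)
whitened_cov = T_white @ sigma_1 @ T_white.conj().T
```

The transformed noise has identity covariance, and because the transform is invertible, the whitened output carries the same information as the original observation.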
Given realizations of $A'$ and $B$, the MMSE in (74) can be expressed as

\begin{align}
\mathrm{mmse}(A'x; t \mid z) &= \mathrm{mmse}(A'x; t \mid Bx + n_2) \tag{75} \\
&= E\left\| A'x - A' E\left[ x \,\middle|\, \begin{bmatrix} \sqrt{t}\, A' \\ B \end{bmatrix} x + \begin{bmatrix} n_1' \\ n_2 \end{bmatrix} \right] \right\|^2 \tag{76}
\end{align}

which is the MMSE of $A'x$ conditioned on a linear transformation of $x$ with additive Gaussian noise. Let the covariance of $x$ be $Q = \mathrm{cov}\{x\}$. Let $x_Q \sim \mathcal{CN}(0, Q)$ be Gaussian with the same covariance. Then the MMSE (75) cannot decrease if the input $x$ is replaced by $x_Q$, i.e.,

$$\mathrm{mmse}(A'x; t \mid z) \le \mathrm{mmse}(A'x_Q; t \mid z_Q) \tag{77}$$

holds for every $t > 0$, where $z_Q = Bx_Q + n_2$. The reason is that the estimator that minimizes the mean-square error for $A'x_Q$ is linear, and the same linear estimator achieves the same mean-square error when applied to $A'x$. This implies that using the optimal (nonlinear) estimator for $A'x$ can only yield a smaller MMSE.
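The fact that a Gaussian input maximizes the MMSE for a given covariance can be illustrated in a much simpler setting than the one in this proof. The sketch below assumes a real-valued scalar channel $\sqrt{\mathrm{snr}}\, x + n$ with $n \sim \mathcal{N}(0, 1)$ and compares a unit-variance Gaussian input with an equiprobable $\pm 1$ input, whose conditional-mean estimator is a hyperbolic tangent. The integral is approximated by a plain Riemann sum; the function names and grid parameters are hypothetical.

```python
import numpy as np

def mmse_bpsk(snr, half_width=12.0, points=48001):
    """MMSE of equiprobable x in {-1, +1} observed through sqrt(snr)*x + n."""
    grid = np.linspace(-half_width, half_width, points)
    step = grid[1] - grid[0]
    pdf = np.exp(-0.5 * grid**2) / np.sqrt(2.0 * np.pi)
    # E[x | y] = tanh(sqrt(snr) * y); condition on x = +1 by symmetry.
    return 1.0 - np.sum(pdf * np.tanh(snr + np.sqrt(snr) * grid)) * step

def mmse_gaussian(snr):
    """MMSE of a unit-variance real Gaussian input (linear estimator is optimal)."""
    return 1.0 / (1.0 + snr)
```

For every positive SNR the binary input yields a smaller MMSE than the Gaussian input of the same variance, which is the scalar analogue of (77).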



Plugging (77) into (74), we see that, in order to maximize the mutual information $I(y; x \mid z, A, B)$, it suffices to restrict the input vector to the set of Gaussian random vectors, i.e., it boils down to finding the covariance matrix $Q$ that maximizes the mutual information. As we shall see, the optimal $Q$ is $(\gamma/M) I_M$.

Consider the eigenvalue decomposition $Q = U \Lambda U^\dagger$. Then $U^\dagger x_Q$ consists of independent entries. Due to the isotropy of $A'$ and $B$, $A' U^\dagger x_Q$ and $B U^\dagger x_Q$ are identically distributed as $A' x_Q$ and $B x_Q$, respectively. Hence the MMSE is invariant to the eigenvectors of $Q$. Therefore, the maximization problem can be further restricted to all Gaussian $x_Q$ with independent entries, i.e., $Q$ is diagonal.

To maximize the mutual information, the diagonal entries of $Q$ must all be equal: Let $\pi$ be the collection of all $M!$ permutation matrices for the $M$-dimensional linear space. By isotropy of $A$ and the concavity of the conditional MMSE, we have

\begin{align}
\mathrm{mmse}\left(A' x_Q; t \mid z_Q, A', B\right)
&= \frac{1}{M!} \sum_{\Pi \in \pi} \mathrm{mmse}\left(A' x_{\Pi Q \Pi^\dagger}; t \mid z_{\Pi Q \Pi^\dagger}, A', B\right) \notag \\
&\le \mathrm{mmse}\left(A' x_R; t \mid z_R, A', B\right) \tag{78}
\end{align}

where

$$R = \frac{1}{M!} \sum_{\Pi \in \pi} \Pi Q \Pi^\dagger$$

has identical diagonal entries. Therefore, to maximize the mutual information, we can further restrict the optimization problem to Gaussian i.i.d. inputs. In other words,

$$I(y; x \mid z, A, B) \le I\left(A x_{\rho I} + n_1; x_{\rho I} \mid z_{\rho I}, A, B\right) \tag{79}$$

for some $\rho \le \gamma/M$.
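The permutation averaging used to obtain $R$ can be checked directly for a small dimension. The sketch below is an illustration under the assumption that $Q$ is a real diagonal covariance; it enumerates all $M!$ permutation matrices, so it is only practical for small $M$. The function name is hypothetical.

```python
import itertools
import math
import numpy as np

def permutation_average(Q):
    """Average of P Q P^T over all M! permutation matrices P."""
    M = Q.shape[0]
    total = np.zeros_like(Q, dtype=float)
    for perm in itertools.permutations(range(M)):
        P = np.eye(M)[list(perm)]  # rows of the identity, reordered
        total += P @ Q @ P.T
    return total / math.factorial(M)

Q = np.diag([3.0, 1.0, 2.0])
R = permutation_average(Q)
```

For a diagonal $Q$ the average is $(\mathrm{tr}\, Q / M) I_M$, so the total power is preserved while the diagonal entries become equal.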

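Finally, we show that the maximum mutual information is achieved by $\rho = \gamma/M$. Suppose otherwise, i.e., $\rho < \gamma/M$. For convenience, denote $x_{\rho I}$ by $x_\rho$. Let $\tilde{x} \sim \mathcal{CN}(0, (\gamma/M - \rho) I_M)$ be independent of $x_\rho$. Then $x_{\gamma/M} = x_\rho + \tilde{x}$. Given realizations of $A$ and $B$,

\begin{align}
I(A x_\rho + n_1; x_\rho \mid z_\rho)
&= I\left(A(x_\rho + \tilde{x}) + n_1; x_\rho + \tilde{x} \mid B(x_\rho + \tilde{x}) + n_2, \tilde{x}\right) \notag \\
&= I\left(A x_{\gamma/M} + n_1; x_{\gamma/M} \mid B x_{\gamma/M} + n_2, \tilde{x}\right) \notag \\
&\le I\left(A x_{\gamma/M} + n_1; x_{\gamma/M}, \tilde{x} \mid B x_{\gamma/M} + n_2\right) \tag{80} \\
&= I\left(A x_{\gamma/M} + n_1; x_{\gamma/M} \mid B x_{\gamma/M} + n_2\right) + I\left(A x_{\gamma/M} + n_1; \tilde{x} \mid B x_{\gamma/M} + n_2, x_{\gamma/M}\right) \notag \\
&= I\left(A x_{\gamma/M} + n_1; x_{\gamma/M} \mid B x_{\gamma/M} + n_2\right) + I\left(n_1; \tilde{x} \mid n_2, x_{\gamma/M}\right) \notag \\
&= I\left(A x_{\gamma/M} + n_1; x_{\gamma/M} \mid B x_{\gamma/M} + n_2\right) \tag{81}
\end{align}

where (80) is due to the chain rule and (81) is due to the independence of the signals and the noises.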



Similarly, (18) can be proved by stacking the sequences of vectors into larger vectors.